Background
With all of the new hype surrounding RSS and weblogs, I decided to create a
simple RSS feed for my site. Along the way I ran into a couple of eccentricities
that I thought I would share with my fellow CP'ians. If you would like to see
this code in action, check out.
I started out by locating the RSS 2.0 specification, which can be found here. I won't go
into an explanation of the spec, because they do a better job themselves than I
could. Once I had reviewed that, I went out looking for an existing example. I
found a few, but they all used a StringBuilder to generate the actual XML. This
can produce malformed XML whenever the content contains characters such as < or &
that nobody remembered to escape, so I decided to do an article using XmlTextWriter.
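To make the difference concrete, here is a minimal sketch comparing the two approaches on a title that contains markup characters. The class and method names (EscapingDemo, WithStringBuilder, WithXmlTextWriter) are made up for this illustration and are not part of the article's code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, for illustration only.
public static class EscapingDemo
{
    // Builds a <title> element by raw concatenation: nothing is escaped.
    public static string WithStringBuilder(string title)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("<title>").Append(title).Append("</title>");
        return sb.ToString();
    }

    // Builds the same element with XmlTextWriter, which escapes
    // markup characters in the text content for us.
    public static string WithXmlTextWriter(string title)
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        writer.WriteElementString("title", title);
        writer.Flush();
        return buffer.ToString();
    }

    public static void Main()
    {
        string title = "Tips & <b>Tricks</b>";
        // The first line is not well-formed XML content; the second is.
        Console.WriteLine(WithStringBuilder(title));
        Console.WriteLine(WithXmlTextWriter(title));
    }
}
```

The StringBuilder version passes the raw ampersand and tags straight through, while the XmlTextWriter version writes them as entity references, so an aggregator's parser should accept it.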
Implementation
Once I had decided to use the XmlTextWriter, I jumped right into coding. I
started out by writing a method to write the prologue, or header, of the RSS
document.
public XmlTextWriter WriteRSSPrologue(XmlTextWriter writer)
{
    // XML declaration and the root <rss> element.
    writer.WriteStartDocument();
    writer.WriteStartElement("rss");
    writer.WriteAttributeString("version", "2.0");
    writer.WriteAttributeString("xmlns:blogChannel",
        "http://backend.userland.com/blogChannelModule");

    // <channel> stays open so that items can be added afterwards.
    writer.WriteStartElement("channel");
    writer.WriteElementString("title", "Simple RSS Document");
    writer.WriteElementString("link", "http://www.danielbright.net/");
    writer.WriteElementString("description",
        "A simple RSS document generated using XMLTextWriter");
    writer.WriteElementString("copyright", "Copyright 2002-2003 Dan Bright");
    writer.WriteElementString("generator", "RSSviaXmlTextWriter v1.0");
    return writer;
}
As you can see, you pass this method the XmlTextWriter instance that you want
to use and it will return it with the header information written. Simple enough.
Now we move on to adding <item> elements to the document. Once again I reviewed
the spec. Everything looked fairly straightforward until I came to the <pubDate>
element.
"All date-times in RSS conform to the Date and Time Specification of
RFC 822, with the
exception that the year may be expressed with two characters or four characters
(four preferred)."
This had me stumped. I spent about 20 minutes trying to roll my own RFC 822
compliant DateTime format, and then inspiration struck: Google. After a quick
search I discovered that Martin Gudgin had faced the same problem and found a
solution. It seems (and if I had read the DateTime.ToString() documentation I
would have known) that passing the "r" format specifier gets us the RFC 822
style date we need. Nice. One caveat: "r" always labels the result as GMT
without converting the value, so the DateTime should be converted to UTC first.
public XmlTextWriter AddRSSItem(XmlTextWriter writer,
    string sItemTitle, string sItemLink,
    string sItemDescription)
{
    writer.WriteStartElement("item");
    writer.WriteElementString("title", sItemTitle);
    writer.WriteElementString("link", sItemLink);
    writer.WriteElementString("description", sItemDescription);
    // "r" prints "GMT" but does not convert, so switch to UTC before formatting.
    writer.WriteElementString("pubDate",
        DateTime.Now.ToUniversalTime().ToString("r"));
    writer.WriteEndElement();
    return writer;
}
This time we pass the method the XmlTextWriter, along with the actual parts of
the <item> we wish to create.
Note: It will generate the <pubDate> element for us using DateTime.Now, but this
is just an example. If you were to load this document into an RSS aggregator,
the items would appear new each time the document was requested, quickly filling
up the aggregator with duplicate entries. In a real feed, pass in each item's
actual publication date instead.
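One way around the duplicate-entry problem is an overload that takes the publication date from the caller. The sketch below is my own assumption of how that could look: the extra published parameter, the ToRssDate helper and the RssItemDemo class are hypothetical names, not part of the original download.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical demo class, for illustration only.
public static class RssItemDemo
{
    // Formats a timestamp with the "r" specifier. The specifier always
    // prints "GMT" but does not convert the value, so convert to UTC first.
    public static string ToRssDate(DateTime published)
    {
        return published.ToUniversalTime().ToString("r");
    }

    // Same as AddRSSItem, except the caller supplies the publication date.
    public static XmlTextWriter AddRSSItem(XmlTextWriter writer,
        string sItemTitle, string sItemLink,
        string sItemDescription, DateTime published)
    {
        writer.WriteStartElement("item");
        writer.WriteElementString("title", sItemTitle);
        writer.WriteElementString("link", sItemLink);
        writer.WriteElementString("description", sItemDescription);
        writer.WriteElementString("pubDate", ToRssDate(published));
        writer.WriteEndElement();
        return writer;
    }

    public static void Main()
    {
        // A fixed date, as it might come from a database record.
        DateTime published = new DateTime(2003, 9, 15, 12, 0, 0, DateTimeKind.Utc);

        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        AddRSSItem(writer, "Item Title", "http://test.com",
            "This is a test item", published);
        writer.Flush();

        // The <pubDate> reads: Mon, 15 Sep 2003 12:00:00 GMT
        Console.WriteLine(buffer.ToString());
    }
}
```

Because the date now comes from the item itself, requesting the feed twice yields the same <pubDate>, which should keep aggregators from treating old items as new.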
Now we have to clean up behind our RSS document, again passing the XmlTextWriter
to be used.
public XmlTextWriter WriteRSSClosing(XmlTextWriter writer)
{
    writer.WriteEndElement();   // closes <channel>
    writer.WriteEndElement();   // closes <rss>
    writer.WriteEndDocument();
    return writer;
}
Once we have this foundation in place, we move on to actually creating an RSS
document.
This is where I ran into my second problem. I had been using a StringWriter as
the buffer for my XmlTextWriter to write into. This is fine except for the fact
that a StringWriter is hard-coded to report UTF-16 as its encoding, so the XML
declaration says encoding="utf-16". This isn't a problem at all, unless you try
to open our RSS document using Internet Explorer. The internal style sheet that
IE uses to display XML apparently does not like UTF-16, and will not display
the document.
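Another Google search led me to a post on Roy Osherove's weblog discussing this
very thing. A big thanks goes out to Stephane, who pointed out that a
MemoryStream is the way to go with this one. (The Page_Load listing has since
been changed to write directly to Response.OutputStream; see History.)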
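The encoding difference is easy to see in isolation. This is only a sketch under my own naming (EncodingDemo, ViaStringWriter and ViaMemoryStream are hypothetical); it writes the same tiny document through both kinds of buffer and returns the resulting text.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, for illustration only.
public static class EncodingDemo
{
    // StringWriter reports UTF-16, so the declaration says encoding="utf-16".
    public static string ViaStringWriter()
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        writer.WriteStartDocument();
        writer.WriteStartElement("rss");
        writer.WriteEndDocument();   // also closes the open <rss> element
        writer.Flush();
        return buffer.ToString();
    }

    // With a stream we choose the encoding ourselves, here UTF-8.
    public static string ViaMemoryStream()
    {
        MemoryStream buffer = new MemoryStream();
        // UTF8Encoding(false) leaves out the byte order mark.
        XmlTextWriter writer = new XmlTextWriter(buffer, new UTF8Encoding(false));
        writer.WriteStartDocument();
        writer.WriteStartElement("rss");
        writer.WriteEndDocument();
        writer.Flush();
        return Encoding.UTF8.GetString(buffer.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(ViaStringWriter());   // declaration names utf-16
        Console.WriteLine(ViaMemoryStream());   // declaration names utf-8
    }
}
```

The same reasoning applies to any Stream, which is why handing the writer a stream plus an explicit Encoding sidesteps the UTF-16 declaration.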
private void Page_Load(object sender, System.EventArgs e)
{
    // Set the response headers before any output is written.
    Response.ContentEncoding = System.Text.Encoding.UTF8;
    Response.ContentType = "text/xml";
    Response.Cache.SetCacheability(HttpCacheability.Public);

    // Write straight to the response stream as UTF-8.
    XmlTextWriter writer = new XmlTextWriter(Response.OutputStream,
        System.Text.Encoding.UTF8);
    WriteRSSPrologue(writer);
    AddRSSItem(writer, "Item Title", "http://test.com", "This is a test item");
    AddRSSItem(writer, "Item 2 Title", "http://test.com/blabla.aspx",
        "This is the second test item");
    // The markup in this title is escaped by the XmlTextWriter.
    AddRSSItem(writer, "<b>Item 2 Title</b>", "http://test.com/blabla.aspx",
        "This is the second test item");
    WriteRSSClosing(writer);
    writer.Flush();
    writer.Close();
    Response.End();
}
And there we have it. This will now output valid RSS, even when invalid
characters are passed, thanks to the XmlTextWriter.
History
- Update: 15 Sep. 2003
- I have used Response.OutputStream instead of the MemoryStream, as suggested in
the comment below. This fixed two small bugs I had found. Previously, the
stylesheet used by IE would not properly render the page if more than 5 items
were listed. This is resolved. Also, IE wouldn't render the page properly if the
description field was written as CDATA. This is also resolved.
- I have added an overload to AddRSSItem that allows you to write the
description as CDATA via a boolean parameter.
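The CDATA overload is not listed in the article, so the following is only a guess at its shape. The bDescAsCDATA parameter name and the CDataDemo class are my own assumptions; the one thing I am sure of is that XmlTextWriter.WriteCData does the actual work.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical demo class, for illustration only.
public static class CDataDemo
{
    // Sketch of an AddRSSItem overload: when bDescAsCDATA is true, the
    // description is wrapped in a CDATA section instead of being escaped.
    public static XmlTextWriter AddRSSItem(XmlTextWriter writer,
        string sItemTitle, string sItemLink,
        string sItemDescription, bool bDescAsCDATA)
    {
        writer.WriteStartElement("item");
        writer.WriteElementString("title", sItemTitle);
        writer.WriteElementString("link", sItemLink);
        if (bDescAsCDATA)
        {
            // WriteCData throws if the text itself contains "]]>".
            writer.WriteStartElement("description");
            writer.WriteCData(sItemDescription);
            writer.WriteEndElement();
        }
        else
        {
            writer.WriteElementString("description", sItemDescription);
        }
        writer.WriteElementString("pubDate",
            DateTime.Now.ToUniversalTime().ToString("r"));
        writer.WriteEndElement();
        return writer;
    }

    public static void Main()
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        AddRSSItem(writer, "Item Title", "http://test.com",
            "<b>This is a test item</b>", true);
        writer.Flush();
        Console.WriteLine(buffer.ToString());
    }
}
```

Writing the description as CDATA keeps embedded HTML readable in the feed source; the escaped form is equivalent as far as an XML parser is concerned.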